Adding Word Duration Information to Bigram Language Models
Abstract
Suprasegmental information, while generally thought to play an important role in speech recognition by human listeners, has shown little promise in previous attempts to integrate it into ASR systems. This paper outlines an approach that exploits suprasegmental information by modeling duration within the context of N-gram language modeling. Results show that up to half of the variance in word-level timing can be explained by a simple bigram duration model. The experiments were conducted on the Switchboard corpus of conversational telephone speech. The paper also outlines a way of augmenting the N-gram language model with suprasegmental information.
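The abstract does not specify the form of the duration model, so the following is only a minimal sketch of one way the "variance explained" figure could be computed. It assumes the model predicts each word's duration as the mean duration observed for its (previous word, word) bigram. The function name `variance_explained`, the token format, and the toy durations are all hypothetical and are not taken from the paper or from Switchboard.

```python
from collections import defaultdict


def variance_explained(tokens):
    """Fraction of duration variance explained by bigram-conditioned means.

    tokens: list of (previous_word, word, duration_seconds) tuples.
    Assumption: the "model" is just the per-bigram mean duration,
    evaluated on the same data it was estimated from (in-sample).
    """
    durations = [dur for _, _, dur in tokens]
    grand_mean = sum(durations) / len(durations)

    # Collect the observed durations for each (previous_word, word) bigram.
    by_bigram = defaultdict(list)
    for prev, word, dur in tokens:
        by_bigram[(prev, word)].append(dur)
    bigram_mean = {key: sum(vals) / len(vals) for key, vals in by_bigram.items()}

    # Total variation around the global mean vs. residual variation
    # around each bigram's own mean.
    ss_total = sum((dur - grand_mean) ** 2 for dur in durations)
    ss_resid = sum((dur - bigram_mean[(prev, word)]) ** 2
                   for prev, word, dur in tokens)
    return 1.0 - ss_resid / ss_total


# Invented toy data: two renditions of the same three-word utterance.
toy_tokens = [
    ("<s>", "i", 0.10), ("i", "think", 0.30), ("think", "so", 0.40),
    ("<s>", "i", 0.14), ("i", "think", 0.26), ("think", "so", 0.48),
]
r2 = variance_explained(toy_tokens)
print(f"variance explained: {r2:.3f}")
```

Because the means are estimated and scored on the same tokens, this in-sample figure is optimistic; a held-out split would be needed for a fair estimate, and rare bigrams would need smoothing or backoff to a unigram mean.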
Related resources
Integrating Large Context Language Models into a Real Time Word Recognizer
In this paper we present a new recognizer architecture that allows the efficient integration of language models with arbitrarily large context information, e.g. polygram models, into the recognition process. Instead of using these models for rescoring the n best word chains generated using bigram information, we extract the best word chain, or optionally the n best word chains, directly from the wo...
Interpolated Distanced Bigram Language Models for Robust Word Clustering
Two methods for interpolating the distanced bigram language model are examined which take into account pairs of words that appear at varying distances within a context. The language models under study yield a lower perplexity than the baseline bigram model. A word clustering algorithm based on mutual information with robust estimates of the mean vector and the covariance matrix is employed in t...
Word Pairs in Language Modeling for Information Retrieval
Previous language modeling approaches to information retrieval have focused primarily on single terms. The use of bigram models has been studied, but the restriction on word order and adjacency may not be justified for information retrieval. We propose a new language modeling approach to information retrieval that incorporates lexical affinities, or pairs of words that occur near each other, wi...
Using a stochastic context-free grammar as a language model for speech recognition
This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous continuous-speech understanding system (Jurafsky et al. 1994). We describe an algorithm for using a probabilistic Earley parser and a stochastic context-free grammar (SCFG) to generate word transition prob...
Subword lexical modelling for speech recognition
In this work, we introduce and develop a novel framework, ANGIE, for modelling subword lexical phenomena in speech recognition. Our framework provides a flexible and powerful mechanism for capturing morphology, syllabification, phonology, and other subword effects in a hierarchical manner which maximizes sharing of subword structures. ANGIE models the subword structure within a context-free grammar...